Importing the basic libraries needed to start exploring the dataset

Reading the Dataset

There are no missing values in the dataset.
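A minimal sketch of how such a check is usually done with pandas. The small DataFrame and its column names (`CreditScore`, `Age`, `Balance`) are hypothetical stand-ins, not the real dataset:

```python
import pandas as pd

# Hypothetical stand-in for the real dataset; column names are assumed.
df = pd.DataFrame({
    "CreditScore": [619, 608, 502, 699],
    "Age": [42, 41, 42, 39],
    "Balance": [0.0, 83807.86, 159660.80, 0.0],
})

# Count missing values per column; all zeros means nothing is missing.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Single boolean summary for the whole frame.
has_missing = bool(df.isnull().values.any())
print("Any missing values:", has_missing)
```

If any count were non-zero, that column would need imputation or row removal before modelling.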

Skewness measures how asymmetric a data distribution is. A negative value indicates a left skew (the tail is longer on the left), while a positive value indicates a right skew (the tail is longer on the right). The output shows the skewness of each feature; the closer a value is to 0, the more symmetric that feature's distribution.
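To make the sign convention concrete, here is a small sketch on toy data, assuming the skewness is computed with pandas `DataFrame.skew`. The three columns are invented for illustration and are not taken from the dataset:

```python
import pandas as pd

# Toy columns built to show each case; values are illustrative only.
toy = pd.DataFrame({
    "right_tailed": [1, 2, 2, 3, 3, 3, 20],        # one large value stretches the right tail
    "symmetric": [1, 2, 3, 4, 5, 6, 7],            # evenly spread around the mean
    "left_tailed": [-20, -3, -3, -3, -2, -2, -1],  # one small value stretches the left tail
})

# Per-column sample skewness; numeric_only skips any non-numeric columns.
skewness = toy.skew(numeric_only=True)
print(skewness)
```

The `right_tailed` column should come out positive, `left_tailed` negative, and `symmetric` at zero.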

From this we can infer that the customer ID is unique to every customer, so it will have no effect on the target variable.
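One way to confirm that a column is a pure identifier is to compare its number of distinct values with the number of rows. The sketch below assumes the column is named `CustomerId`; the values are made up:

```python
import pandas as pd

# Hypothetical sample; the column names and IDs are assumptions.
df = pd.DataFrame({
    "CustomerId": [15634602, 15647311, 15619304, 15701354],
    "Exited": [1, 0, 1, 0],
})

# An identifier has exactly one distinct value per row.
n_unique = df["CustomerId"].nunique()
is_identifier = df["CustomerId"].is_unique
print(f"{n_unique} unique IDs in {len(df)} rows -> unique per row: {is_identifier}")
```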

Creating bar plots to understand the patterns in the categorical data

The number of customers sharing a surname varies widely, from 1 to 30, but the counts follow no uniform trend or pattern. From this we can infer that the surname may not be a strong factor affecting the Exited target variable. We will decide whether to keep or drop it after further bivariate and trivariate analysis.
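The counts behind such a bar plot can be inspected directly. This is a sketch on invented surnames, assuming the column is called `Surname`:

```python
import pandas as pd

# Hypothetical sample; the real dataset has far more rows and surnames.
df = pd.DataFrame({
    "Surname": ["Smith", "Hill", "Smith", "Onio", "Smith", "Hill"],
})

# Number of customers per surname, most frequent first.
surname_counts = df["Surname"].value_counts()
print(surname_counts)

# The spread of these counts is what the bar plot visualises.
print("fewest:", surname_counts.min(), "most:", surname_counts.max())
```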

As we can see, all the values of tenure lie within a specific range, and there are no outliers in this feature.

Here the bulk of the customers fall in the 30-50 age range. The outliers span ages 60-90, as those values lie outside the main bulk.

Here most customers' credit scores fall in the range 580-720, and the outliers lie at the low end, i.e. around 100-400.

The balance values are continuous across their range, with no abnormal values.
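The outlier observations above can also be checked numerically. The sketch below assumes the usual box-plot rule, which flags points more than 1.5 times the interquartile range beyond the quartiles. The helper `iqr_outliers` and the sample values are hypothetical, not part of the original analysis:

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]


# Illustrative values only: ages with two high stragglers, tenure evenly spread.
ages = pd.Series([31, 35, 38, 40, 42, 45, 48, 50, 72, 88], name="Age")
tenure = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], name="Tenure")

print("Age outliers:", iqr_outliers(ages).tolist())
print("Tenure outliers:", iqr_outliers(tenure).tolist())
```

Applying the same helper to each numeric column gives a quick cross-check of what the box plots show.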